A Predictive Framework for Integrating Disparate Genomic Data Types Using Sample-Specific Gene Set Enrichment Analysis and Multi-Task Learning
Abstract
Understanding the root molecular and genetic causes driving complex traits is a fundamental challenge in genomics and genetics. Numerous studies have used variation in gene expression to understand complex traits, but the underlying genomic variation that contributes to these expression changes is not well understood. In this study, we developed a framework to integrate gene expression and genotype data to identify biological differences between samples from opposing complex trait classes that are driven by expression changes and genotypic variation. This framework utilizes pathway analysis and multi-task learning to build a predictive model and discover pathways relevant to the complex trait of interest. We simulated expression and genotype data to test the predictive ability of our framework and to measure how well it uncovered pathways with genes both differentially expressed and genetically associated with a complex trait. We found that the predictive performance of the multi-task model was comparable to other similar methods. Also, methods like multi-task learning that considered enrichment analysis scores from both data sets found pathways with both genetic and expression differences related to the phenotype. We used our framework to analyze differences between estrogen receptor (ER) positive and negative breast cancer samples. An analysis of the top 15 gene sets from the multi-task model showed they were all related to estrogen, steroids, cell signaling, or the cell cycle. Although our study suggests that multi-task learning does not enhance predictive accuracy, the models generated by our framework do provide valuable biological pathway knowledge for complex traits.
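The two building blocks named in the abstract, sample-specific gene set scores and a multi-task model that shares gene sets across data types, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the mean z-score is a simple stand-in for whatever sample-specific enrichment statistic the framework uses, the row-wise L2,1 (group) penalty is an assumed way of coupling the expression and genotype tasks, and all function names are hypothetical.

```python
import math


def zscore_columns(matrix):
    """Standardise each feature (column) across samples (rows)."""
    n = len(matrix)
    scaled_cols = []
    for col in zip(*matrix):
        mean = sum(col) / n
        var = sum((v - mean) ** 2 for v in col) / n
        sd = math.sqrt(var) or 1.0  # constant features map to zero
        scaled_cols.append([(v - mean) / sd for v in col])
    return [list(row) for row in zip(*scaled_cols)]


def sample_gene_set_scores(matrix, gene_sets):
    """Return a samples x gene-sets matrix of mean z-scores.

    ASSUMPTION: the mean z-score stands in for a sample-specific
    enrichment score; gene_sets is a list of column-index lists.
    """
    z = zscore_columns(matrix)
    return [
        [sum(row[g] for g in genes) / len(genes) for genes in gene_sets]
        for row in z
    ]


def group_soft_threshold(weights, lam):
    """Row-wise proximal step for an L2,1 penalty.

    ASSUMPTION: weights[k] holds gene set k's coefficients for each task
    (for example expression and genotype). A row is shrunk as a whole, so
    a gene set is kept or dropped for all tasks together.
    """
    out = []
    for row in weights:
        norm = math.sqrt(sum(w * w for w in row))
        scale = max(0.0, 1.0 - lam / norm) if norm > 0 else 0.0
        out.append([scale * w for w in row])
    return out


if __name__ == "__main__":
    # Toy data: 2 samples x 2 genes, and two hypothetical gene sets.
    expression = [[1.0, 0.0], [3.0, 0.0]]
    gene_sets = [[0], [0, 1]]
    print(sample_gene_set_scores(expression, gene_sets))

    # Rows are gene sets, columns are tasks (expression, genotype).
    print(group_soft_threshold([[3.0, 4.0], [0.3, 0.4]], lam=1.0))
```

In this sketch the same scoring function would be applied to the expression matrix and to the genotype matrix separately, giving one feature matrix per task. The group shrinkage step then favours gene sets whose coefficients are large in both tasks, which is one plausible reading of how a multi-task model surfaces pathways with both expression and genetic differences.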
Similar resources
Designing a Conceptual Framework for Integrating Components of Professional Ethics in a Ceramic Curriculum
Background: Teaching professional ethics in the ceramics branch requires using a standard system of integrating professional ethics components in the ceramics curriculum elements to determine the relationship between professional ethics and the curriculum components. The aim of the present study is a conceptual framework for integrating the elements of professional ethics in the ceramic’s curri...
An Integrated Model of Multiple-Condition ChIP-Seq Data Reveals Predeterminants of Cdx2 Binding
Regulatory proteins can bind to different sets of genomic targets in various cell types or conditions. To reliably characterize such condition-specific regulatory binding we introduce MultiGPS, an integrated machine learning approach for the analysis of multiple related ChIP-seq experiments. MultiGPS is based on a generalized Expectation Maximization framework that shares information across mul...
ConceptGen: a gene set enrichment and gene set relation mapping tool
MOTIVATION The elucidation of biological concepts enriched with differentially expressed genes has become an integral part of the analysis and interpretation of genomic data. Of additional importance is the ability to explore networks of relationships among previously defined biological concepts from diverse information sources, and to explore results visually from multiple perspectives. Accomp...
Systematically Differentiating Functions for Alternatively Spliced Isoforms through Integrating RNA-seq Data
Integrating large-scale functional genomic data has significantly accelerated our understanding of gene functions. However, no algorithm has been developed to differentiate functions for isoforms of the same gene using high-throughput genomic data. This is because standard supervised learning requires 'ground-truth' functional annotations, which are lacking at the isoform level. To address this...
TASK-BASED CURRICULUM FROM THE NURSING EDUCATION EXPERTS' VIEWPOINT: A PHENOMENOLOGICAL STUDY
Background & Aims: One of the new educational approaches adopted by many medical schools around the world as a suitable method for teaching and learning is the task-based curriculum developed by Harden et al. (1996) in the medical education curricula. The purpose of the present study was to identify the characteristics and methods of teaching and learning task-based curriculum in nursing educat...
Volume: 7
Publication date: 2012